This assignment attempts to solve the 2021 IEEE Visual Analytics Science and Technology (VAST) Challenge: Mini-Challenge 2 by applying different visual analytics concepts, methods, and techniques with relevant R data visualisation and data analysis packages.
Given the data sources provided, identify potential informal or unofficial relationships among GAStech personnel. Provide evidence for these relationships.
As in Question 3, identify the points of interest (POIs) by computing the gap between consecutive GPS timestamps for each car; a gap of more than 5 minutes is treated as a stop.
Afterwards, identify who is in ‘close contact’ with each employee by comparing the GPS coordinates of stops recorded within the same time period.
This helps establish relationships among GAStech personnel based on their being at the same place at the same time.
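# Flag a POI wherever a car's GPS trace pauses for more than 5 minutes.
# Note: CarID is constant within each group, so lag() here follows row order;
# rows are assumed to be sorted by timestamp within each car.
gps_poi_network <- car_gps_data %>%
  group_by(CarID) %>%
  mutate(poi_diff = timestamp - lag(timestamp, order_by = CarID)) %>%
  mutate(poi = if_else(poi_diff > 60 * 5, TRUE, FALSE)) %>%
  filter(poi == TRUE) %>%
  ungroup() %>%
  # Compare each POI with the previous POI in time, regardless of car.
  mutate(lat_diff = lat - lag(lat, order_by = timestamp)) %>%
  mutate(long_diff = long - lag(long, order_by = timestamp)) %>%
  # Keep POIs within 0.001 degrees of the previous POI in both coordinates.
  mutate(close_contact = if_else(abs(lat_diff) <= 0.001 & abs(long_diff) <= 0.001, TRUE, FALSE)) %>%
  filter(close_contact == TRUE) %>%
  ungroup()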
glimpse(gps_poi_network)
Rows: 773
Columns: 15
$ timestamp <dttm> 2014-01-06 07:34:01, 2014-01-06 07:44:01, 201~
$ CarID <fct> 10, 12, 8, 13, 30, 22, 16, 107, 33, 10, 20, 19~
$ lat <dbl> 36.07333, 36.06365, 36.06365, 36.05408, 36.054~
$ long <dbl> 24.86418, 24.88593, 24.88594, 24.90125, 24.901~
$ date <dttm> 2014-01-06, 2014-01-06, 2014-01-06, 2014-01-0~
$ day <ord> Mon, Mon, Mon, Mon, Mon, Mon, Mon, Mon, Mon, M~
$ hour <int> 7, 7, 7, 8, 8, 8, 11, 11, 11, 11, 11, 11, 12, ~
$ Deparment <chr> "Executive", "Security", "Information Technolo~
$ Title <chr> "SVP/CIO", "Site Control", "IT Technician", "S~
$ FullName <chr> "Ada Campo-Corrente", "Hideki Cocinaro", "Luca~
$ poi_diff <drtn> 1980 secs, 1965 secs, 1307 secs, 2005 secs, 1~
$ poi <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE~
$ lat_diff <dbl> -1.833e-05, -5.626e-05, -6.738e-05, -2.531e-05~
$ long_diff <dbl> 7.720e-06, 3.232e-05, 1.306e-05, 2.260e-06, -6~
$ close_contact <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE~
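To make the two filters above concrete, here is a minimal sketch on hypothetical toy data. The `toy_*` objects, coordinates and timestamps are invented for illustration and are not part of the challenge data; only the 5-minute gap rule and the 0.001-degree threshold are taken from the code above.

```r
library(dplyr)

# Hypothetical GPS log for two cars; timestamps are seconds after an arbitrary origin.
toy_gps <- tibble(
  CarID     = c(1, 1, 1, 1, 2, 2, 2),
  timestamp = as.POSIXct(c(0, 60, 900, 960, 30, 90, 1000),
                         origin = "2014-01-06", tz = "UTC"),
  lat       = c(36.0500, 36.0510, 36.0700, 36.0710, 36.0600, 36.0610, 36.0702),
  long      = c(24.8500, 24.8510, 24.8800, 24.8810, 24.8600, 24.8610, 24.8803)
)

# Step 1: within each car, keep the first ping after a gap longer than 5 minutes.
toy_poi <- toy_gps %>%
  group_by(CarID) %>%
  arrange(timestamp, .by_group = TRUE) %>%
  mutate(gap_secs = as.numeric(difftime(timestamp, lag(timestamp), units = "secs"))) %>%
  ungroup() %>%
  filter(!is.na(gap_secs), gap_secs > 60 * 5)

# Step 2: across all cars, keep POIs that lie within 0.001 degrees of the
# previous POI in time.
toy_contact <- toy_poi %>%
  arrange(timestamp) %>%
  mutate(lat_diff  = lat - lag(lat),
         long_diff = long - lag(long)) %>%
  filter(abs(lat_diff) <= 0.001, abs(long_diff) <= 0.001)
```

On this toy data, `toy_poi` keeps one row per car and `toy_contact` keeps only the later of the two (car 2). This illustrates a property of the consecutive-row comparison: the earlier arrival of a pair is retained only if it also matched the POI before it, and only POIs adjacent in time are compared.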
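# Edges: within each date-hour, pair every visit with the next visit in time
# and count how often each (from, to) pair occurs.
# The last visit of each date-hour has no successor, so its `to` is NA.
employee_edges <- gps_poi_network %>%
  group_by(date, hour) %>%
  mutate(from = FullName) %>%
  mutate(to = lead(FullName, order_by = timestamp)) %>%
  ungroup() %>%
  group_by(from, to) %>%
  summarise(weight = n())
# Nodes: one row per employee, grouped (coloured) by department.
employee_nodes <- gps_poi_network %>%
  select(FullName, Deparment, Title) %>%
  rename(id = FullName) %>%
  rename(group = Deparment) %>%
  distinct()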
visNetwork(employee_nodes,
           employee_edges,
           main = "Relationships among GASTech Personnel") %>%
  visIgraphLayout(layout = "layout_with_fr") %>%
  visEdges(arrows = "to",
           smooth = list(enabled = TRUE,
                         type = "curvedCW")) %>%
  visOptions(highlightNearest = TRUE,
             nodesIdSelection = TRUE) %>%
  visLegend() %>%
  visLayout(randomSeed = 123)
The network diagram shows the ‘official’ relationships of employees based on their respective departments. It also shows ‘unofficial’ relationships based on the number of their interactions.
From the network diagram, it can be seen that Isande Barrasca, a Drill Technician from the Engineering Department, is an outlier. His only close contact among the other employees is Hideki Cocinaro, a Site Controller from the Security Department.
Similarly, Sten Sanjorge Jr., an IT Technician from the Information Technology Department, has minimal interactions with other employees and does not seem well connected within the company.
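Outlier claims like these can be cross-checked numerically by counting how many distinct contacts each person has. The following is a minimal sketch on a hypothetical edge list with the same `from`/`to`/`weight` shape as `employee_edges`; the names `toy_edges` and `toy_degree` and the people A to D are invented for illustration.

```r
library(dplyr)

# Hypothetical edge list in the same shape as employee_edges.
toy_edges <- tibble(
  from   = c("A", "A", "B", "C", "D"),
  to     = c("B", "C", "C", "A", "A"),
  weight = c(3, 1, 2, 4, 1)
)

# Treat edges as undirected: list both directions of every edge, drop
# incomplete pairs, then count distinct partners per person.
toy_degree <- bind_rows(
  toy_edges %>% select(person = from, partner = to),
  toy_edges %>% select(person = to, partner = from)
) %>%
  filter(!is.na(person), !is.na(partner)) %>%
  group_by(person) %>%
  summarise(n_contacts = n_distinct(partner), .groups = "drop") %>%
  arrange(n_contacts, person)
```

People with the fewest distinct contacts appear first, so in this toy example D, who is linked only to A, heads the list in the same way an isolated employee would in the real network.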
The heatmap below visualizes the number of interactions between employees.
employee_interact1 <- full_join(employee_edges,
                                employee_nodes,
                                by = c("from" = "id")) %>%
  rename(SenderDepartment = group) %>%
  rename(SenderTitle = Title)
employee_interact2 <- full_join(employee_interact1,
                                employee_nodes,
                                by = c("to" = "id")) %>%
  rename(ReceiverDepartment = group) %>%
  rename(ReceiverTitle = Title) %>%
  rename(Sender = from) %>%
  rename(Receiver = to)
employee_interaction <- ggplot(data = employee_interact2,
                               aes(x = Sender, y = Receiver,
                                   fill = weight,
                                   # Tooltip text shown by ggplotly for each tile.
                                   text = paste("Sender:", Sender, "\n",
                                                "Sender Department:", SenderDepartment, "\n",
                                                "Sender Title:", SenderTitle, "\n",
                                                "\n",
                                                "Receiver:", Receiver, "\n",
                                                "Receiver Department:", ReceiverDepartment, "\n",
                                                "Receiver Title:", ReceiverTitle, "\n",
                                                "\n",
                                                "Number of Meetings:", weight))) +
  geom_tile() +
  scale_fill_gradient(low = "lightsteelblue1", high = "royalblue4") +
  ggtitle("GAStech Personnel Relationship based on number of Interactions") +
  labs(x = "Sender Employee", y = "Receiver Employee") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90))
ggplotly(employee_interaction, tooltip = "text")
The highest number of meetings among GAStech personnel, 23 interactions, is between truck drivers. Their employee names appear as NA because their CarIDs are not matched to an identified employee. The second highest number of meetings belongs to Bertrand Ovan, Group Manager in the Facilities department, with 14 meetings.
The third highest number of meetings belongs to Ingrid Barranco, SVP/CFO in the Executive department, with 12 meetings.
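Because all unidentified drivers collapse into a single NA name, their meetings cannot be told apart in the network or heatmap. One possible workaround, shown here as a minimal sketch on hypothetical rows, is to fall back to a label built from the CarID. The `toy_visits` table, the `label` column, the "Truck" prefix and CarID 105 are assumptions for illustration only.

```r
library(dplyr)

# Hypothetical visits: two named employees and two cars without a matched name.
toy_visits <- tibble(
  CarID    = c("10", "107", "105", "12"),
  FullName = c("Ada Campo-Corrente", NA, NA, "Hideki Cocinaro")
)

# Use the employee name when available, otherwise a label built from the CarID.
toy_labelled <- toy_visits %>%
  mutate(label = coalesce(FullName, paste("Truck", CarID)))
```

Using `label` instead of `FullName` when building the edge list would keep each unidentified vehicle as its own node rather than merging them all into NA.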
Do you see evidence of suspicious activity? Identify 1-10 locations where you believe the suspicious activity is occurring, and why.
Building on the POI network with employee interactions from Question 4, convert the GPS coordinates to simple features and plot them on the tourist map.
gps_poi_network_sf <- st_as_sf(gps_poi_network,
                               coords = c("long", "lat"),
                               crs = 4326)
gps_poi_network_sf
Simple feature collection with 773 features and 13 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 24.85088 ymin: 36.04802 xmax: 24.90814 ymax: 36.08962
Geodetic CRS: WGS 84
# A tibble: 773 x 14
timestamp CarID date day hour Deparment
* <dttm> <fct> <dttm> <ord> <int> <chr>
1 2014-01-06 07:34:01 10 2014-01-06 00:00:00 Mon 7 Executive
2 2014-01-06 07:44:01 12 2014-01-06 00:00:00 Mon 7 Security
3 2014-01-06 07:59:01 8 2014-01-06 00:00:00 Mon 7 Informat~
4 2014-01-06 08:03:01 13 2014-01-06 00:00:00 Mon 8 Security
5 2014-01-06 08:14:01 30 2014-01-06 00:00:00 Mon 8 Security
6 2014-01-06 08:17:01 22 2014-01-06 00:00:00 Mon 8 Security
7 2014-01-06 11:46:01 16 2014-01-06 00:00:00 Mon 11 Security
8 2014-01-06 11:46:01 107 2014-01-06 00:00:00 Mon 11 <NA>
9 2014-01-06 11:47:01 33 2014-01-06 00:00:00 Mon 11 Engineer~
10 2014-01-06 11:52:01 10 2014-01-06 00:00:00 Mon 11 Executive
# ... with 763 more rows, and 8 more variables: Title <chr>,
# FullName <chr>, poi_diff <drtn>, poi <lgl>, lat_diff <dbl>,
# long_diff <dbl>, close_contact <lgl>, geometry <POINT [°]>
gps_poi_network_points <- gps_poi_network_sf %>%
  select(timestamp,
         CarID,
         Deparment,
         Title,
         FullName,
         date,
         hour)
tmap_mode("view")
tm_shape(bgmap) +
  tm_rgb(bgmap, r = 1, g = 2, b = 3,
         alpha = NA,
         saturation = 1,
         interpolate = TRUE,
         max.value = 255) +
  tm_shape(gps_poi_network_points) +
  tm_dots(col = 'red', border.col = 'black', size = 1, alpha = 0.5, jitter = .8) +
  tm_facets(by = "date", ncol = 1)
1. Frydos Autosupply n More
This location is suspicious because of the 10,000 spent there on 2014-01-13. Additionally, members of the Security department frequently visit this place.


2. Spetsons Park
On January 07, 2014, at 3:25, Isia Vann visited this place, which is very unusual, especially in the early hours of the morning.

3. CEO’s house
On January 10, 2014, at 23:23, Axel Calzas visited the place where the CEO resides, followed by Kanon Herrero at around 23:33. About an hour later, on January 11, 2014, at 00:25, Felix Balas can also be seen in the vicinity.


4. Chostus Hotel
On January 08, 2014, around 13:00, both Brand Tempestad and Elsa Orilla were in the vicinity of the hotel. This is unusual because it was during office hours on a weekday.


5. Warehouse near Sannan Park
On January 10, 2014, 22:20, Minke Mies visited this place. He also frequently visits the location around the Frydos Autosupply n More. 
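Several of the observations above rest on visits at unusual hours. As a minimal sketch of how such visits could be shortlisted programmatically, the example below filters a small hypothetical POI table by hour of day. The three rows reuse names and times quoted above, while the `toy_poi` and `late_night` names and the 22:00 to 05:00 window are assumptions chosen for illustration.

```r
library(dplyr)

# Hypothetical POI table built from visits mentioned in the text.
toy_poi <- tibble(
  FullName  = c("Isia Vann", "Axel Calzas", "Brand Tempestad"),
  timestamp = as.POSIXct(c("2014-01-07 03:25:00",
                           "2014-01-10 23:23:00",
                           "2014-01-08 13:00:00"), tz = "UTC")
) %>%
  mutate(hour = as.integer(format(timestamp, "%H")))

# Keep visits that start late at night or before dawn (assumed window).
late_night <- toy_poi %>%
  filter(hour >= 22 | hour < 5)
```

Here `late_night` retains the Isia Vann and Axel Calzas visits and drops the midday one. Daytime anomalies such as the hotel visit need a different rule, for example restricting to office hours on weekdays.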
This assignment attempts to solve the 2021 VAST Challenge: Mini-Challenge 2 by applying different visual analytics concepts, methods, and techniques.
An interactive bar chart was used to identify the most popular location, Katerina’s Cafe, while an interactive heatmap was used to determine the days and times when GAStech employees visit each place. An interactive boxplot was used for an initial analysis of outliers, while the plot_anomaly_diagnostics function was used to diagnose unusual purchases, particularly the 10,000 transaction at Frydos Autosupply n More.
An interactive heatmap was also used to assess anomalies, showing the transactions with missing credit card or loyalty card data. After adding the GPS and car data and plotting the movement paths using tmap, 4 employees were identified who may be involved in the suspicious transactions at Frydos Autosupply n More.
An approach was proposed to determine the owners of the loyalty and credit card data. It involves mapping the timestamps of credit card purchases against the interactive ‘Point of Interest’ map.
Similar to the POI analysis, relationships among GAStech personnel were established based on their ‘close contact’ with each other, that is, being at the same place at the same time. An interactive network graph and heatmap were used to show the GAStech personnel relationships based on the number of their interactions.
By synthesizing the information from Questions 1 to 4 and using interactive POI maps, several locations were identified as places where suspicious activities are happening.
Using relevant R data visualisation and data analysis packages, the previous submissions from the 2014 VAST Challenge were enhanced by adding interactive features and making the visualisations reproducible.
Finally, this assignment could be further improved by building an R Shiny app with a friendlier user interface for performing the investigation.
VAST Challenge 2014: MC2 - Patterns of Life Analysis Benchmark Repository
Whiting, M., Cook, K., Grinstein, G., Liggett, K., Cooper, M., Fallon, J., & Morin, M. (2014). VAST Challenge 2014: The Kronos incident.
For attribution, please cite this work as
Dolit (2021, July 25). Visual Analytics & Applications: Visual Detective Assignment Part 4. Retrieved from https://adolit-vaa.netlify.app/posts/2021-07-26-assignment-4/
BibTeX citation
@misc{dolit2021visual,
author = {Dolit, Archie},
title = {Visual Analytics & Applications: Visual Detective Assignment Part 4},
url = {https://adolit-vaa.netlify.app/posts/2021-07-26-assignment-4/},
year = {2021}
}